danehans
Contributor

What type of PR is this?
/kind test
/area conformance-test

What this PR does / why we need it:

Adds a weight-based traffic splitting test to ensure implementations properly balance traffic across multiple InferencePool backendRefs that include a weight definition.

Which issue(s) this PR fixes:
Fixes #1668

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot
Contributor

@danehans: The label(s) kind/test, area/conformance-test cannot be applied, because the repository doesn't have them.



netlify bot commented Sep 30, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 9de9d4b
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/68dc17b2d73a740008cd3527
😎 Deploy Preview https://deploy-preview-1669--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 30, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danehans

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 30, 2025
@danehans
Contributor Author

cc: @robscott @zetxqx

Contributor

@zetxqx zetxqx left a comment


Thank you so much!

With one of the changes I proposed in my comments to reduce flakiness, this test passes consistently (5/5 runs) on GKE.

    gateway_weighted_two_pools.go:233: Weighted split OK: primary=0.745 (hits=149/200), expected=0.700, tolerance=±0.100; secondary hits=51
    apply.go:283: 2025-10-07T01:46:48.661093085Z: Deleting httproute-weighted-two-pools HTTPRoute
=== RUN   TestConformance/HTTPRouteInvalidInferencePoolRef
    conformance.go:68: Skipping HTTPRouteInvalidInferencePoolRef: test explicitly skipped
=== RUN   TestConformance/HTTPRouteMultipleGatewaysDifferentPools
    conformance.go:68: Skipping HTTPRouteMultipleGatewaysDifferentPools: test explicitly skipped
=== RUN   TestConformance/InferencePoolAccepted
    conformance.go:68: Skipping InferencePoolAccepted: test explicitly skipped
=== RUN   TestConformance/InferencePoolHTTPRoutePortValidation
    conformance.go:68: Skipping InferencePoolHTTPRoutePortValidation: test explicitly skipped
=== RUN   TestConformance/InferencePoolInvalidEPPService
    conformance.go:68: Skipping InferencePoolInvalidEPPService: test explicitly skipped
=== RUN   TestConformance/HTTPRouteMultipleRulesDifferentPools
    conformance.go:68: Skipping HTTPRouteMultipleRulesDifferentPools: test explicitly skipped
=== RUN   TestConformance/InferencePoolResolvedRefsCondition
    conformance.go:68: Skipping InferencePoolResolvedRefsCondition: test explicitly skipped
--- PASS: TestConformance (176.93s)
    --- SKIP: TestConformance/EppUnAvailableFailOpen (0.00s)
    --- SKIP: TestConformance/GatewayFollowingEPPRouting (0.00s)
    --- PASS: TestConformance/GatewayWeightedAcrossTwoInferencePools (172.38s)
    --- SKIP: TestConformance/HTTPRouteInvalidInferencePoolRef (0.00s)
    --- SKIP: TestConformance/HTTPRouteMultipleGatewaysDifferentPools (0.00s)
    --- SKIP: TestConformance/InferencePoolAccepted (0.00s)
    --- SKIP: TestConformance/InferencePoolHTTPRoutePortValidation (0.00s)
    --- SKIP: TestConformance/InferencePoolInvalidEPPService (0.00s)
    --- SKIP: TestConformance/HTTPRouteMultipleRulesDifferentPools (0.00s)
    --- SKIP: TestConformance/InferencePoolResolvedRefsCondition (0.00s)
PASS
ok  	sigs.k8s.io/gateway-api-inference-extension/conformance	177.134s

Comment on lines +146 to +152
primarySet := func() map[string]struct{} {
	m := make(map[string]struct{}, len(primaryPodNames))
	for _, n := range primaryPodNames {
		m[n] = struct{}{}
	}
	return m
}()
Contributor


Why not remove the function wrapper? Is it there for stricter scoping?
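
To make the question concrete, here is a minimal sketch of two alternatives to the immediately invoked closure: building the set inline, or moving the loop into a named helper. `toSet` and the pod names are hypothetical and only illustrate the idea; they are not from the PR.

```go
package main

import "fmt"

// toSet is a hypothetical helper (not from the PR) that builds a membership
// set from a slice of names.
func toSet(names []string) map[string]struct{} {
	set := make(map[string]struct{}, len(names))
	for _, n := range names {
		set[n] = struct{}{}
	}
	return set
}

func main() {
	// Placeholder pod names, used only for this illustration.
	primaryPodNames := []string{"primary-0", "primary-1"}

	// Inline form: same result as the closure, but the loop variable n and
	// the map are declared directly in the enclosing scope.
	inline := make(map[string]struct{}, len(primaryPodNames))
	for _, n := range primaryPodNames {
		inline[n] = struct{}{}
	}

	// Helper form: keeps the temporaries scoped, and can be reused for a
	// second pool's pod names.
	primarySet := toSet(primaryPodNames)

	_, inPrimary := primarySet["primary-0"]
	fmt.Println(len(inline), len(primarySet), inPrimary)
}
```

All three forms produce the same set. The closure and the helper both keep the temporary variables out of the surrounding function, while the helper also avoids repeating the loop if a second set is needed.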

"model": "conformance-fake-model",
"prompt": "Write as if you were a critic: San Francisco"
}`

Contributor


Would you like to add something similar to the following to reduce flakiness? I'm using a GKE setup, and this single test sometimes fails. With the following code added, it always passed on GKE.

		// Provide a union list of eligible endpoints for the test. Each pool's EPP
		// should filter to endpoints that actually belong to its pool.
		allIPs := append(append([]string{}, primaryPodIPs...), secondaryPodIPs...)
		allIPNames := append(append([]string{}, primaryPodNames...), secondaryPodNames...)
		eppHeaderValue := strings.Join(allIPs, ",")

		requestBody := `{
			"model": "conformance-fake-model",
			"prompt": "Write as if you were a critic: San Francisco"
		}`

		for i := 0; i < len(allIPs); i++ {
			// Send an initial request targeting a single pod and wait for it to be successful to ensure the Gateway and EPP
			// are functioning correctly before running the main test cases.
			traffic.MakeRequestAndExpectSuccess(
				t,
				s.RoundTripper,
				s.TimeoutConfig,
				gwAddr,
				traffic.Request{
					Host: hostname,
					Path: path,
					Headers: map[string]string{
						test.HeaderTestEppEndPointSelectionKey: allIPs[i],
					},
					Method:    http.MethodPost,
					Body:      requestBody,
					Backend:   allIPNames[i],
					Namespace: resources.AppBackendNamespace,
				},
			)
		}

Successfully merging this pull request may close these issues.

Conformance: Test Weight-Based InferencePool Traffic Splitting
3 participants